ABSTRACT We identified a strain of betacoronavirus, BtKY72/Rhinolophus sp./Kenya/2007
(here, BtKY72), from rectal swab samples in Kenyan bats. This paper reports the
complete genomic sequence of BtKY72, which is closely related to
BtCoV/BM48-31/Bulgaria/2008, a severe acute respiratory syndrome (SARS)-related virus
from Rhinolophus bats in Europe.

The 2002 and 2003 outbreak of severe acute respiratory syndrome coronavirus
(SARS-CoV) infection was a significant public health threat at the beginning of the 
21st century (1-6). Initial identification of SARS-CoV in civet cats and other wild animals
in live animal markets suggested a zoonotic origin (7). Later, Rhinolophus sp. bats were identified
as harboring severe acute respiratory syndrome-related CoV at high frequencies and 
were believed to be a natural reservoir host for SARS-CoV (8, 9). 

During a 5-year bat coronavirus (CoV) surveillance study (2006 to 2010) in Kenya, we 
identified five bat betacoronaviruses by pan-CoV reverse transcription-PCR (RT-PCR) 
from fecal samples of Chaerephon and Rhinolophus bats (10, 11). The Institutional 
Animal Care and Use Committee (IACUC) of the Centers for Disease Control and 
Prevention and Kenya Wildlife Services approved all protocols related to the animal 
experiments in this study. These bat betacoronaviruses shared >98% nucleotide 
identity with each other and were clustered with other known bat SARS-related CoVs 
identified from Rhinolophus bats in China and Europe (8, 9, 12-15) based on a short 
amplicon sequence of open reading frame 1b (ORF1b) (121 bp). We selected RNA from
the BtKY72 bat, which was one of the five betacoronavirus-positive bats from a previous 
study (11), for full genome sequencing. To determine the full genome sequence, 
consensus degenerate primers were designed from conserved sequences based on all 
known SARS-related CoVs (Table 1). Several short sequence islands scattered
throughout the genome were first determined from a Kenyan Rhinolophus bat by
seminested or nested RT-PCR with sets of consensus primers, followed by Sanger
sequencing. Then, sets of sequence-specific primers were used to fill the gaps and
generate the full genome sequence, named BtKY72/Rhinolophus sp./Kenya/2007 (Table 1).
The 5' and 3' ends of the genome were determined using a 5'/3' rapid amplification of
cDNA ends (RACE) kit (Roche). Complete genome sequencing was not performed for the
other four betacoronavirus-positive bats because of limited viral loads in their fecal samples.
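
As a rough illustration of what a consensus degenerate primer encodes, the sketch below expands IUPAC ambiguity codes into the concrete oligonucleotides they represent. The primer shown is hypothetical and is not one of the primers in Table 1; the function names are illustrative only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer):
    """Return the number of distinct sequences a degenerate primer encodes."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """Return every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer for illustration; NOT taken from Table 1.
example = "GGTTGGGAYTAYCC"
print(degeneracy(example))  # → 4
for oligo in expand(example):
    print(oligo)
```

Lower degeneracy generally means a more specific primer pool, which is one reason primers of this kind are usually placed in the most conserved regions of an alignment.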

The genome of BtKY72 was 29,259 nucleotides long, including the poly(A) tail, with 
39% G+C content. Sequence alignment and BLAST analysis of the full-length
genome sequences showed that the BtKY72 genome shared 81% overall nucleotide
identity with its nearest relative, BtCoV/BM48-31, which was identified from a
Rhinolophus bat in Europe (15), and 93 to 94% amino acid identity with BtCoV/BM48-31
in the seven concatenated, conserved replicase domains (ADP-ribose-1″-phosphatase
[ADRP], nonstructural protein 5 [nsp5], and nsp12 to nsp16) (Fig. 1). Phylogenetic
analysis suggested that BtKY72 belongs to the subgenus Sarbecovirus of the
genus Betacoronavirus (Fig. 1). The genome organization contained the following gene
order: 5' UTR-ORF1ab-S-ORF3a-E-M-ORF6-ORF7a-ORF7b-N-3' UTR. Unlike SARS-CoV
and other known SARS-CoV-related bat viruses, both ORF3b and ORF8 were absent in
BtKY72. ORF8 was also missing in its closest neighbor, BtCoV/BM48-31 (15).

[Figure 1: phylogenetic tree; tip labels in order from top to bottom]
AY274119 SARS CoV Tor2
AY278488 SARS CoV BJ01
AY613950 SARS CoV PC4-227
AY304486 SARS CoV SZ3
KY417146 SARS related CoV Rs4231 CHN Rhinolophus sinicus
KY417151 SARS related CoV Rs7327 CHN Rhinolophus sinicus
KC881005 SARS related CoV RsSHC014 CHN Rhinolophus sinicus
KC881006 SARS related CoV Rs3367 CHN Rhinolophus sinicus
KJ473816 SARS related CoV YN2013 CHN Rhinolophus sinicus
FJ588686 SARS related CoV Rs672/2006 CHN Rhinolophus sinicus
KY417143 SARS related CoV Rs4081 CHN Rhinolophus sinicus
KY417142 SARS related CoV As6526 CHN Aselliscus stoliczkanus
KP886809 SARS related CoV YNLF 34C CHN Rhinolophus ferrumequinum
DQ071615 SARS related CoV Rp3 CHN Rhinolophus pearsoni
KJ473815 SARS related CoV GX2013 CHN Rhinolophus sinicus
JX993988 SARS related CoV Yunnan2011 CHN Chaerephon plicata
DQ412042 SARS related CoV Rf1 CHN Rhinolophus ferrumequinum
DQ412043 SARS related CoV Rm1 CHN Rhinolophus macrotis
KJ473814 SARS related CoV HuB2013 CHN Rhinolophus sinicus
JX993987 SARS related CoV Shaanxi2011 CHN Rhinolophus pusillus
DQ022305 SARS related CoV HKU3-1 CHN Rhinolophus sinicus
MG772933 SARS related CoV bat-SL-CoVZC45 CHN Rhinolophus sinicus
• KY352407 SARS related CoV BtKY72 KEN Rhinolophus sp.
NC014470 Bat CoV BM48-31/BGR/2008 BGR Rhinolophus blasii
HQ166910 SARS related CoV ZBCoV NGA Hipposideros commersoni
KF636752 Bat CoV Zhejiang2013 CHN Hipposideros pratti
KJ473821 Bat CoV SC2013 CHN Vespertilio superans
NC019843 MERS-CoV
NC009020 Bat CoV HKU5-1 CHN Pipistrellus abramus
NC009019 Bat CoV HKU4-1 CHN Tylonycteris pachypus
NC009021 Bat CoV HKU9-1 CHN Rousettus leschenaulti
NC030886 Bat CoV GCCDC1 CHN Rousettus leschenaulti
NC005147 HCoV-OC43
NC001846 MHV
NC006577 HCoV-HKU1

FIG 1 Phylogenetic analysis of whole-genome sequences of betacoronaviruses. The phylogenetic tree is inferred using the maximum likelihood (ML) method
available in PhyML version 3.0 (16), assuming a general time-reversible (GTR) model with a discrete gamma-distributed rate variation among sites (Γ) and a
subtree pruning and regrafting (SPR) tree-swapping algorithm. The sequences are labeled with accession number, strain name, geographic (three-letter country
code), and host (species) information. BtKY72/Rhinolophus sp./Kenya/2007, sequenced in this study, is highlighted with a solid circle. The genus taxonomy
information is shown to the right side of the phylogeny. The maximum likelihood bootstrap is indicated next to the nodes. The scale bar indicates the estimated
number of nucleotide substitutions per site. KEN, Kenya; CHN, China; BGR, Bulgaria; NGA, Nigeria; MERS-CoV, Middle East respiratory syndrome coronavirus;
HCoV, human coronavirus; MHV, mouse hepatitis virus; ZBCoV, Zaria bat coronavirus.
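
The genome statistics reported above (G+C content and percent nucleotide identity) can be reproduced from sequence data along the following lines. This is a minimal sketch under stated assumptions: the inputs are plain strings, the two sequences passed to `percent_identity` are already aligned to equal length, and alignment columns containing a gap are skipped. The exact identity convention used in the study is not stated, so this is one common choice and not necessarily the authors' method; the toy sequences are made up.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous A/C/G/T/U bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    return gc / total if total else 0.0


def percent_identity(aligned_a, aligned_b):
    """Percent identical positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are
    excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy sequences for illustration only; not real viral sequence.
print(gc_content("ATGCGCAT"))  # → 0.5
print(round(percent_identity("ATG-CGTA", "ATGACGTT"), 1))  # → 85.7
```

Applied to a full genome, `gc_content` would be run on the complete sequence string, and `percent_identity` on a pairwise alignment of BtKY72 with a reference such as BtCoV/BM48-31.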

In conclusion, our study demonstrates that the SARS-related CoVs that were identified
from Rhinolophus bats in China and Europe were also present in Kenyan Rhinolophus
bats (Fig. 1). The discovery of SARS-related CoVs in Kenyan bats adds to the
diversity and geographic range of CoVs in Rhinolophus bats. The genome data for 
BtKY72 will facilitate understanding of the molecular evolutionary characteristics of bat 
SARS-related CoV. 

Data availability. The complete genome sequence of BtKY72 is available in GenBank
under the accession number KY352407.
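
For readers who want to retrieve the record programmatically, the sketch below builds a request URL for the NCBI E-utilities `efetch` endpoint using the accession number given above. The helper name `efetch_url` is illustrative, and the download step is left commented out because it requires network access.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities EFetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an EFetch URL for a nucleotide record.

    rettype="fasta" requests FASTA sequence; rettype="gb" requests the
    GenBank flat file with annotations.
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


url = efetch_url("KY352407")
print(url)

# To download the sequence (requires network access):
# from urllib.request import urlopen
# with urlopen(url) as response:
#     fasta_text = response.read().decode("utf-8")
```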